SHOULD OPTIMAL DESIGNERS WORRY ABOUT CONSIDERATION?


MINHUA LONG AND W. ROSS MORROW 


Abstract. Consideration set formation using non-compensatory screening rules is a vital
component of real purchasing decisions with decades of experimental validation. Marketers
have recently developed statistical methods that can estimate quantitative choice
models that include consideration set formation via non-compensatory screening rules.
But is capturing consideration within models of choice important for design? This paper
reports on a simulation study of vehicle portfolio design, in which households screen over
vehicle body style, built to explore the importance of capturing consideration rules for
optimal designers. We generate synthetic market share data, fit a variety of discrete choice
models to the data, and then optimize design decisions using the estimated models. Model
predictive power, design “error”, and profitability relative to ideal profits are compared as
the amount of market data available increases. We find that even when estimated
compensatory models provide relatively good predictive accuracy, they can lead to sub-optimal
design decisions when the population uses consideration behavior; convergence of
compensatory models to non-compensatory behavior is likely to require unrealistic amounts of
data; and modeling heterogeneity in non-compensatory screening is more valuable than
heterogeneity in compensatory trade-offs. This supports the claim that designers should
carefully identify consideration behaviors before optimizing product portfolios. We also
find that higher model predictive power does not necessarily imply better design decisions;
that is, different model forms can provide “descriptive” rather than “predictive”
information that is useful for design.


1. INTRODUCTION 

Conventional discrete choice models [1, 2] have been applied in design for market systems
[3, 4, 5, 6, 7, 8, 9, 10] over the past decade. Generally, the choice model serves to forecast
demand as a function of product features, thus enabling design decisions that maximize
forecast profits. These conventional choice models share the assumption that individuals
choose by processing and weighing all attributes, for all alternatives, when maximizing
utility. Under this assumption, choice is a compensatory decision-making process in which
tradeoffs can take place across all features and all alternatives: in particular,
shortcomings in one attribute can always be compensated for by making others sufficiently
attractive. Empirical studies have shown the opposite: people often use “fast and frugal”
non-compensatory rules to eliminate options when faced with task complexity [11],
time pressure [12], information cost [13] and memory requirements [14]. The use of such
heuristics (decision rules that ignore information) is widespread and beneficial [15]. This
paper investigates the importance of including consideration behavior when making design
decisions.

The awareness that consumers use non-compensatory rules has changed the traditional
concept of the choice set in choice modeling [16]. Rather than assuming only a universal
choice set containing all alternatives, researchers now actively study consideration sets
[17, 18]. Consideration sets are subsets of the universal set that are chosen by
individuals following internal, non-compensatory rules. Building on early research
on non-compensatory decision models [19, 20, 21], non-compensatory rules proposed for
consideration set formation include conjunctive, disjunctive, subset conjunctive, and even
lexicographic rules; see [22] for further background and examples. Accepting consideration
implies that identifying the structure and distribution of screening rules is an
important empirical task, one that much recent research addresses, as reviewed in Section 2.

But is modeling consideration important when making design decisions? Marketers have
only shown the advantage of modeling consideration through improvement in model predictive
accuracy, though this has been accomplished across a wide variety of product categories
including cameras, batteries, automobiles, cellphones, and computers [23, 24, 18, 25, 26];
e.g., see Table 1 below. Simulation experiments have illustrated the limits of classical
compensatory models, including the multinomial and random coefficient (Mixed) Logit models,
when modeling non-compensatory choice behavior [27, 28, 29]. Existing engineering studies
demonstrate how design can include consideration in choice model structure, and how
this might affect decisions [30, 31], but have not compared the performance of
compensatory and non-compensatory models when both types of models are estimated on the
same data with a comparable level of system knowledge. Even if compensatory models
do not represent non-compensatory choice behavior well, could they still suggest product
designs similar to designs that are optimal for true, non-compensatory behavior? If the
non-compensatory behavior is modeled directly, how much closer could a firm get to true
optimal designs? How much does the value of the chosen designs, e.g. profits, differ
between designs chosen using compensatory versus non-compensatory models?

We describe a simulation study that examines how well compensatory models perform 
in 1) recovering non-compensatory choice behavior, 2) suggesting design decisions near to 
ideal optimal decisions, and 3) suggesting designs that capture all potential profitability. 
Our “synthetic data” [32] simulation experiment has the following steps: 


1:: Define a synthetic population with known “true” choice behavior;

2:: Simulate responses of this population to a sequence of “markets” with randomly
generated product profiles;

3:: Estimate compensatory and non-compensatory models from the responses and
validate predictive power;

4:: Optimize design decisions with the estimated models and evaluate design profit
using the “true” behavior.

Synthetic data experiments are an effective method for detecting choice model properties 
in specific situations or when testing the validity of an estimation approach [29, 33, 34, 32]. 
We extend this paradigm to also include the quality and value of decisions made using 
estimated models, the ultimate goal of choice modeling within engineering design. The 
synthetic data experiment allows us to measure the divergence of design decisions and 
outcomes from ideal values that can be obtained only by knowing the true behavioral 
model. We describe an “econometric-style” (revealed preference) experiment that uses 
aggregate share data to estimate choice models. An alternative perspective, more common 
in marketing, samples the population for respondents to choice and/or consideration-based 
conjoint surveys (stated preference). Both perspectives have value, as discussed in [2],
p. 152. Both types of models have also been used in design [4, 35].

Several observations are enabled by the experiment. As would be expected, modeling
consideration with a non-compensatory model results in the best design and pricing
decisions when the population exhibits matching non-compensatory behavior. Conventional
compensatory models can reasonably support profitable design decisions, however, with
several caveats: conventional models might require more data than is reasonably available
to capture non-compensatory behaviors, can suggest simplistic product portfolios, can be
sensitive to sample variance in the training data, and do not forecast the value of design
decisions well even if those decisions could not be improved with a better model. Overall,
modeling heterogeneity in the screening rules used to form consideration sets captures
more value to design than modeling heterogeneity in the compensatory stage. A similar
observation has been made by Andrews et al. [29]. Finally, while assuming that better
model predictive power implies better design decisions is reasonable, it is not necessarily
true: models with lower predictive power can suggest more profitable designs. We hope
our case study will motivate market systems researchers to further examine what
consideration behaviors exist in their product categories and how these behaviors might
influence optimality of chosen designs.

The rest of the paper is organized as follows: Section 2 reviews the consider-then-choose
model construction and estimation studied in marketing research. Section 3 describes the
simulation framework and synthetic data generation process. Sections 4 and 5 provide
details of model estimation and design optimization, respectively. Section 6 presents our
results, followed by discussion in Section 7. Section 8 concludes.

2. CONSIDER-THEN-CHOOSE MODELS 

A consider-then-choose model can be described as follows. Suppose the universal choice
set is $\mathcal{J} = \{1, \dots, J\}$. A consideration set indexed by $r = 1, \dots, R$,
denoted $C_r \subset \{1, \dots, J\}$, is defined by a set of screening rules
$s_r = [s_{r,1}, \dots, s_{r,Z_r}]$. For conjunctive rules, $C_r$ can be written as:

$$C_r(X, p) = \{\, j \in \{1, \dots, J\} : s_r(x_j, p_j) \le 0 \,\} \tag{1}$$

The screening rules depend on product features $x_j$, price $p_j$ as well as other rule-specific
parameters. This definition means that a product needs to satisfy all the screening rules
to be a member of the corresponding consideration set. For example, the consideration set

$C_r(X, p) = \{$all vehicles $j$ with price $p_j$ under \$20,000
AND fuel economy $e_j$ over 30 mpg$\}$

can be defined by

$$s_r(x_j, p_j) = \begin{pmatrix} p_j - 20{,}000 \\ 30 - e_j \end{pmatrix} \le 0.$$

This structure is consistent with the forms used in the marketing literature, although
marketers often define screening rules in terms of indicators instead of inequalities. See,
for example, [23, 36, 32]. These representations can be transformed into the structure
presented here.
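
As a minimal sketch of Eqn. (1) applied to the vehicle example above, the following Python snippet evaluates a conjunctive screening rule as a vector of inequalities and collects the products that satisfy all of them. The `Vehicle` record, the function names, and the three sample vehicles are hypothetical and serve only as an illustration; indices are zero-based, unlike the one-based product index $j$ in the text.

```python
from typing import Callable, List, NamedTuple, Sequence


class Vehicle(NamedTuple):
    """Hypothetical product profile: price in dollars, fuel economy in mpg."""
    price: float
    mpg: float


def example_rule(v: Vehicle) -> List[float]:
    """Screening rule s_r(x_j, p_j); a vehicle passes when every entry is <= 0."""
    return [v.price - 20_000.0, 30.0 - v.mpg]


def consideration_set(products: Sequence[Vehicle],
                      rule: Callable[[Vehicle], List[float]]) -> List[int]:
    """Return the indices j whose rule values are all <= 0 (conjunctive screening)."""
    return [j for j, v in enumerate(products)
            if all(value <= 0.0 for value in rule(v))]


# Illustrative market of three vehicles (made-up numbers).
market = [
    Vehicle(18_500.0, 34.0),   # cheap and efficient: considered
    Vehicle(19_000.0, 26.0),   # cheap but below 30 mpg: screened out
    Vehicle(27_000.0, 41.0),   # efficient but too expensive: screened out
]
considered = consideration_set(market, example_rule)
```

Writing the rule as a function that returns a list of inequality values keeps the conjunctive logic in one place: adding a further criterion only adds one entry to the list.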

Given a collection of screening rules and the associated consideration set, let the
conditional probability that product $j$ is chosen within the set be $P_{j|C_r}$ and let the
probability that the consideration set $C_r$ is formed be $P_{C_r}$. Then the choice probability
$P_j$ can be written as a weighted sum of the choice probabilities across all possible
consideration sets:

$$P_j = \sum_{r=1}^{R} P_{j|C_r} P_{C_r} \tag{2}$$

Hauser [22] calls such models “consideration” or “choice set explosion” models, as they are
subject to combinatorial explosion in the number of parameters needed to capture
consideration set occurrence. Empirical methods estimate $P_{C_r}$ directly, rather than uncovering
structure behind screening by identifying the rules $s_r$. Manrai and Andrews [37] provide
a thorough review of studies applying Eqn. (2) to scanner panel data. Note that Eqn. (2)
can also be considered a type of random coefficients (Mixed) Logit model, though not one
with normally distributed coefficients. This structure has also been found to be similar to
a nested Logit model, as we detail in Sec. 4.5 below.

Preference-conditional choice probabilities then take the following form:

$$P_{j|C_r}(X, p \mid \theta) = \begin{cases} \dfrac{e^{u(x_j, p_j, \theta)}}{1 + \sum_{k \in C_r} e^{u(x_k, p_k, \theta)}} & \text{if } j \in C_r \\[2ex] 0 & \text{if } j \notin C_r \end{cases} \tag{3}$$

where utility $u(\cdot)$ is a function of product characteristics $x_j$ and price $p_j$ given
coefficients $\theta$ that measure preferences. The utility coefficients $\theta$ can be assumed
to be homogeneous across the population or take a random coefficients form to include
heterogeneity (which requires a Monte-Carlo integral of the simple Logit form above). This
formula can, in principle, be extended to capture heterogeneity across consideration sets by
allowing a nontrivial joint distribution between coefficients $\theta$ and consideration sets.

Methods used in early studies to discover non-compensatory screening rules included
tracing and protocol analysis [11, 38] in which respondents’ decision processes were
self-reported or tracked. Shortcomings of this type of method have been reported and include
inconsistencies between the stated screening criteria and observed choices from the same
individual [39]. More recent research shows that the accuracy of direct elicitation approaches
can be improved by designing experiments that are incentive-compatible: for example,
respondents who describe their screening rules for new vehicles in a survey have a decent
chance of actually winning a vehicle described [25]. Estimation tools
that are widely applied in traditional discrete choice analysis, e.g. maximum likelihood
and Bayesian methods, can also be used in non-compensatory model parameter estimation
[40, 24, 23]. These methods may, however, suffer from high computation costs due to
exponential growth in the number of possible consideration sets as the number of attributes
and/or attribute levels grows. Machine learning techniques have recently been adapted to
circumvent this problem by applying greedoid methods [26, 33] or low-dimensional
parameterizations of screening rule likelihood [41, 32]. Broadly speaking, marketing research has
demonstrated predictive power improvement by modeling consideration; see Table 1.

Table 1. Recent consider-then-choose models constructed from stated
preference data, compared to compensatory models estimated in the same
study. Abbreviations: HB - Hierarchical Bayes; MLE - Maximum Likelihood;
HR - “hit rate” (frequency of correct prediction on hold-out samples); KLD
- Kullback-Leibler Divergence; TAU - Kendall’s Tau [42]. A “-” marks a
measure not reported.

                                                                               % Improvement
Reference  Product      Compensatory Model  Consider-then-Choose Model        HR      KLD        TAU
[33]       Laptops      LP Logit            Greedy, Lexicographic             -       -          0%
[24]       Batteries    MLE Logit           MLE, Subset conjunctive           1.1%    -          -
[18]       Cameras      HB Logit            HB, Conjunctive screening         7.1%    -          -
[26]       Smartphones  HB Ranked Logit     Lexicographic by aspects          8.7%    -          -
[25]       Cellphones   HB Logit            Unstructured Direct Elicitation   -       9.1%       -
[43]       Rental Cars  MLE Logit           “Cut-off rules” (conjunctions)    -       14.0% (a)  -
[44]       GPS Units    HB Logit            Greedy, Lexicographic             4.5%    54.5%      -
[32]       Vehicles     HB Logit            Adaptive question HB              44.1%   16.7%      -

(a) [43] characterized improvement with improvement in log-likelihood, which is proportional to KLD.

In principle the specification reviewed above can be estimated from choice data with
classical tools such as Maximum Likelihood Estimation (MLE) and Bayesian methods. To
facilitate this, the representation of consideration set probability $P_{C_r}$ has taken different
forms. Swait [45] introduced a random component into the screening rules so that, with an
assumed distribution, $P_{C_r}$ can be derived based on the probability that any alternative is
acceptable. Ben-Akiva [46] extended this random consideration set generation model by specifying
the availability probability in Logit form. Instead of defining $P_{C_r}$ through parameterized
screening rules, Chiang et al. [47] assumed consideration set probabilities have a Dirichlet
distribution across the population. Gilbride and Allenby [23] avoided the enumeration of
consideration sets by using a reduced form choice probability and Markov Chain Monte-Carlo
methods to sample from the posterior distribution of the allowable screening criteria
values. Exponential growth in the number of possible consideration sets and rules makes
consider-then-choose models difficult to estimate in practice. This challenge motivated
researchers to develop methods that apply to “consider” stage observations to infer screening
rules with more attributes and complexity. For example, MLE methods have been used
on “acceptable/unacceptable” responses to the profiles to estimate the probability that a
particular attribute level is acceptable [24]. Dzyabura & Hauser [32] model a case where
capturing the distribution of screening rules would require $2^{53}$ parameters, too many for a
direct estimation strategy. They develop an adaptive question survey strategy to estimate
conjunctive screening rules by parameterizing screening rule likelihood presuming feature
acceptability is independent, obtaining a model with only 53 parameters per respondent.
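
To make the parameter count concrete, the sketch below contrasts a full table over all binary acceptability vectors with an independence parameterization. It assumes a simple Bernoulli product form, which illustrates the independence idea but is not necessarily the exact likelihood used in [32]; the function name and the probabilities in `q` are hypothetical.

```python
import itertools
import math
from typing import Sequence


def rule_probability(rule: Sequence[int], q: Sequence[float]) -> float:
    """Probability of a binary acceptability vector when each feature level b
    is acceptable independently with probability q[b] (assumed form)."""
    return math.prod(qb if sb == 1 else 1.0 - qb for sb, qb in zip(rule, q))


n_levels = 53
full_table_size = 2 ** n_levels   # one probability per possible acceptability vector
independent_size = n_levels       # one acceptance probability per feature level

# Tiny illustration with three feature levels and made-up probabilities.
q = [0.9, 0.5, 0.2]
prob = rule_probability([1, 0, 1], q)   # 0.9 * (1 - 0.5) * 0.2

# Under independence the probabilities of all 2^3 vectors still sum to one.
total = sum(rule_probability(s, q) for s in itertools.product((0, 1), repeat=3))
```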


3. CASE STUDY: VEHICLE DESIGN UNDER BODY STYLE SCREENING 

We simulate a stylized model of the new vehicle market with potential purchasers that
screen over vehicle body style. Empirical studies have shown body style screening in
both self-reported surveys [48] and statistical inferences [32]. Body style also significantly
impacts the engineering relationships between other features in vehicle design. We often
refer to the synthetic behavior described below as the “true” behavior. We do not use this
terminology to suggest this is how households actually choose new vehicles. This is only a
shorthand appropriate for the context of the simulation experiment.


3.1. Synthetic Behavior. Our population is a mix of groups that screen over $B = 9$
vehicle body styles listed in Table 2. Vehicles are described by fuel economy ($e$), acceleration
($a$), price ($p$) and a $B$-element binary vector $\delta$ for which $\delta_b = 1$ if, and only if,
the vehicle has body style $b$ (thus $\sum_b \delta_b = 1$). Let $s \neq 0$ be a $B$-element binary
vector defining which body styles are “acceptable” to a given individual in the population;
we refer to these vectors succinctly as “screening rules.” Unlike $\delta$, which can have only
one element equal to 1, $s$ can have any number of elements equal to 1. An individual with
screening rule $s$ considers only those vehicles with body styles $b$ such that $s_b = 1$ or,
equivalently, $s^\top \delta \ge 1$. In the notation of Eqns. (1) and (3) we can index individuals by
screening rules $s$ and define

$$C_s(\Delta) = \{\, j \in \{1, \dots, J\} : 1 - s^\top \delta_j \le 0 \,\} \tag{4}$$

where $\Delta = (\delta_1, \dots, \delta_J)$ is a matrix of binary body style vectors.

The fraction of individuals in the population with a particular screening rule $s$ is given
by a probability mass function $\alpha(s)$ drawn from the results of the empirical study reported
in [41, 32]. This study estimated conjunctive screening rules with a Bayesian adaptive
question method for 874 respondents. More specifically, we take $\alpha$ to be the empirical
frequency distribution (over respondents) of the modal (most probable) rules $s$ for the
posterior distribution. Out of the 874 respondents, 219 distinct most-probable conjunctive
screening rules were estimated, and every body style is acceptable to some individual. See
Table 2 for the aggregated acceptability of different body styles over the full respondent
pool.

Table 2. Percentage of respondents accepting the given body style, as
reported in [32].
Table 3. Means and Variances of random coefficients ($\theta$’s) in synthetic
population utility function, Eqn. (5). $\mathcal{N}(\mu, \sigma)$ refers to a normally
distributed variable with mean $\mu$ and variance $\sigma^2$. Values based on the model
from [49].

                                                               Random Coefficient
Attribute      Symbol   Utility                                Mean (μ)   Variance (σ)
Price          (p)      $-\exp\{\mathcal{N}(\mu, \sigma)\}\,p$   2.0        0.1
Fuel Economy   (e)      $\mathcal{N}(\mu, \sigma)/e$             -36.8      2.2
Acceleration   (a)      $\mathcal{N}(\mu, \sigma)/a$             11.3       0.3
Constant       (-)      $\mathcal{N}(\mu, \sigma)$               -23.2      0.5

Conditional on using a screening rule $s$, individuals choose from those vehicles in $C_s(\Delta)$
by maximizing the random utility:

$$U_j = u(e_j, a_j, p_j; \theta) + \varepsilon_j \tag{5}$$

$$u(e, a, p; \theta) = -\exp\{\theta_p\}\, p + \frac{\theta_e}{e} + \frac{\theta_a}{a} + \theta_0 \tag{6}$$

for random coefficients $\theta_l \sim \mathcal{N}(\mu_l, \sigma_l)$ ($l = p, e, a, 0$). $\mathcal{N}$ is the
normal distribution with mean $\mu$ and variance $\sigma$. The exponential in the price coefficient
ensures that lower prices are preferred, all other things being equal. The errors
$\varepsilon = (\varepsilon_0, \varepsilon_1, \dots, \varepsilon_J)$ are i.i.d. extreme value variables
mean-shifted towards zero. The resulting screening-conditional
sub-populations thus follow a Mixed Logit model. There are no correlations between
screening rules and preference over vehicle attributes and price, but there is heterogeneity
in the population.

3.2. Market Share Simulation. We simulate a sales data set that might be collected from
multiple, separate new vehicle markets. The vehicles in each market form a “universal”
choice set for the consumers in that market. A data set for estimation then consists
of vehicle market shares in $M$ separate markets indexed by $m$. For each market, we
draw a set of $J_m$ vehicles, denoted $\mathcal{J}_m$. The profile of vehicle $j$ in market $m$ is
given by drawing fuel economy ($e_{j,m}$), acceleration ($a_{j,m}$), price ($p_{j,m}$) and body style
($b_{j,m}$) from a uniform distribution respectively on intervals [5,50] (mpg), [2,15] (s) and
[10,60] (10k\$) and $\{1, \dots, B\}$. An alternative consistent with our optimal design problem
presented below would be to draw sets of vehicles that satisfy our assumed design constraints.
This is possible, and better matches the stylized market modeling paradigm we employ. However,
random draws are likely to give us better information about choice behavior than correlated
draws, and thus allow us to focus more completely on choice model quality. To investigate
statistical properties of model estimation and use with stochastic data generation and
choice outcomes, this process is repeated with different random seeds.

Given product profiles in market $m$, we draw $N_m$ choice observations in which individuals
can purchase one of the vehicles or choose not to purchase any vehicle (choose the
“outside good”). $N_m$ individuals are drawn from the synthetic population by drawing $N_m$
screening rules $s_i$ from the distribution $\alpha(s)$ along with associated random coefficients
$\theta_i$. Shares $S_{j,m}$ for each vehicle $j$ in each market $m$ (and the outside good) are then
generated by maximizing random utilities (utilities plus error term) for each individual over
their consideration set. See [50] for an explicit algorithm.
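
The following Python sketch illustrates one way the share simulation for a single market could be organized; it is not the algorithm of [50]. It assumes that the second parameter of each normal distribution is a variance, that the outside good has utility zero plus an error term, and that “mean-shifted” errors are standard extreme value draws minus Euler's constant. All names, the two screening rules, and the coefficient values are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

EULER_GAMMA = 0.5772156649015329
Vehicle = Tuple[float, float, float, int]   # (e, a, p, body style index in 0..B-1)


def utility(e: float, a: float, p: float, theta: Sequence[float]) -> float:
    """Eqn. (6) with theta = (theta_p, theta_e, theta_a, theta_0)."""
    th_p, th_e, th_a, th_0 = theta
    return -math.exp(th_p) * p + th_e / e + th_a / a + th_0


def gumbel(rng: random.Random) -> float:
    """Extreme value draw shifted to have mean zero (assumed error convention)."""
    u = rng.random()
    while u == 0.0:                  # avoid log(0)
        u = rng.random()
    return -math.log(-math.log(u)) - EULER_GAMMA


def simulate_shares(vehicles: Sequence[Vehicle],
                    rules: Sequence[Tuple[int, ...]],
                    rule_probs: Sequence[float],
                    mu: Sequence[float], var: Sequence[float],
                    n_obs: int, seed: int = 0) -> List[float]:
    """Return [S_0, S_1, ..., S_J]; index 0 is the outside good."""
    rng = random.Random(seed)
    counts = [0] * (len(vehicles) + 1)
    for _ in range(n_obs):
        s = rng.choices(rules, weights=rule_probs)[0]          # screening rule
        theta = [rng.gauss(m, math.sqrt(v)) for m, v in zip(mu, var)]
        best, best_u = 0, gumbel(rng)                          # outside good
        for j, (e, a, p, body) in enumerate(vehicles, start=1):
            if s[body] != 1:                                   # fails s^T delta_j >= 1
                continue
            u = utility(e, a, p, theta) + gumbel(rng)
            if u > best_u:
                best, best_u = j, u
        counts[best] += 1
    return [c / n_obs for c in counts]


# Illustrative market with B = 3 body styles; nobody accepts body style 2.
vehicles = [(30.0, 8.0, 2.0, 0), (22.0, 6.0, 3.5, 1), (40.0, 11.0, 1.8, 2)]
rules = [(1, 0, 0), (1, 1, 0)]
rule_probs = [0.6, 0.4]
mu, var = (-1.0, -10.0, 4.0, 1.0), (0.1, 2.2, 0.3, 0.5)
shares = simulate_shares(vehicles, rules, rule_probs, mu, var, n_obs=2000, seed=1)
```

Because screening is applied before utilities are compared, a vehicle whose body style is unacceptable to every individual receives a share of exactly zero regardless of its price or performance.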


4. CHOICE MODELS 

We examine four choice model specifications: Multinomial Logit (MNL), Random
Coefficients Logit (RCL), Nested Multinomial Logit (NML) and Consider-Then-Choose Logit
(CTC) models. We assume that all models incorporate the prior information that body
style plays a role in consumer decisions, but different model specifications incorporate this
piece of information in different structures: the MNL and RCL models assume tradeoffs
between body style and other attributes; NML uses nests that separate body styles, thus
constructing a two-stage yet still compensatory process; CTC models the frequency of every
possible consideration set, based on body style, along with compensatory choices
conditional on the consideration set. Note that the true behavior of the synthetic population
exhibits characteristics of both non-compensatory screening and heterogeneity in the
compensatory stage. Thus all the models are misspecified on at least one behavioral feature. The
comparison between these models will thus illustrate the consequence of failing to capture
different behavioral features.

Coefficients in all models are estimated by maximizing the log-likelihood with respect
to the coefficients [2]. For a general choice model with probabilities $P_{j,m}(\theta)$ for
coefficients $\theta$, this takes the form:

$$\begin{aligned} \text{maximize} \quad & \sum_{m=1}^{M} \sum_{j \in \mathcal{J}_m \cup \{0\}} N_m S_{j,m} \log P_{j,m}(\theta) \\ \text{with respect to} \quad & \theta \ \text{(plus possible constraints)} \end{aligned} \tag{7}$$

We abbreviate this process by “MLE” and, for brevity, do not explicitly list each MLE
problem below. Instead we define the choice probability model and list any constraints
imposed on the coefficients as this is sufficient to recreate our process.
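
A minimal sketch of the objective in Eqn. (7), assuming it is the count-weighted sum of log choice probabilities over markets and alternatives (including the outside good). The function `log_likelihood`, its argument layout, and the toy one-product model `two_way` are hypothetical; any of the choice models defined below could be supplied as `prob_fn`.

```python
import math
from typing import Callable, List, Sequence


def log_likelihood(theta,
                   shares: Sequence[Sequence[float]],
                   n_obs: Sequence[int],
                   prob_fn: Callable[[object, int], List[float]]) -> float:
    """Log-likelihood of observed market shares.

    shares[m][j]: observed share of alternative j in market m (j = 0: outside good)
    n_obs[m]:     number of choice observations in market m
    prob_fn:      returns model probabilities for market m in the same order
    """
    total = 0.0
    for m, (s_m, n_m) in enumerate(zip(shares, n_obs)):
        probs = prob_fn(theta, m)
        for s, pr in zip(s_m, probs):
            if s > 0.0:                       # zero shares contribute nothing
                total += n_m * s * math.log(pr)
    return total


def two_way(theta: float, m: int) -> List[float]:
    """Toy model: a single product bought with probability theta."""
    return [1.0 - theta, theta]


observed = [[0.7, 0.3]]                       # one market, 30% purchase share
ll_at_share = log_likelihood(0.3, observed, [100], two_way)
ll_below = log_likelihood(0.2, observed, [100], two_way)
ll_above = log_likelihood(0.4, observed, [100], two_way)
```

In the toy model the objective peaks when the predicted purchase probability equals the observed share, which is the behavior an MLE routine exploits when searching over $\theta$.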
4.1. Multinomial Logit Model (MNL). The MNL model takes the utility of product
$j$ to be

$$u^{MNL}_{j,m}(\theta) = -\exp\{\theta_p\}\, p_{j,m} + \frac{\theta_e}{e_{j,m}} + \frac{\theta_a}{a_{j,m}} + \sum_{b=1}^{B} \theta_b \delta_{j,m,b} + \theta_0 \tag{8}$$

giving choice probabilities

$$P^{MNL}_{j,m}(\theta) = \frac{\exp\{u^{MNL}_{j,m}(\theta)\}}{1 + \sum_{k \in \mathcal{J}_m} \exp\{u^{MNL}_{k,m}(\theta)\}} \tag{9}$$

As with the true behavior, the “exp” term in the price coefficient ensures that the price
coefficient is negative, and thus lower prices are preferred (all other attributes being equal).
The “no buy” or outside good probability is
$P^{MNL}_{0,m}(\theta) = 1 - \sum_{j \in \mathcal{J}_m} P^{MNL}_{j,m}(\theta)$. The body
style coefficients are not independently identified from the constant $\theta_0$ because body styles
are represented by dummies that always sum to one. We use “effects coding” [51] and
constrain the coefficients over body styles to sum to zero. We could, equivalently, leave
the constant term or one of the body style dummies out of the specification. We prefer to
include the constant and all dummies with effects coding as this allows us to capture only
body style specific variations in utility with the coefficients on the body style dummies.
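
The sketch below evaluates Eqns. (8) and (9) for one market and shows why a sum-to-zero constraint is needed: shifting all body style coefficients by a constant and the intercept by its negative leaves every probability unchanged. The function names, the tuple layout of a vehicle, and the coefficient values are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vehicle = Tuple[float, float, float, int]   # (e, a, p, body style index in 0..B-1)


def effects_coded(free: Sequence[float]) -> List[float]:
    """Expand B-1 free body style coefficients to B coefficients summing to zero."""
    return list(free) + [-sum(free)]


def mnl_probabilities(vehicles: Sequence[Vehicle], theta_p: float, theta_e: float,
                      theta_a: float, theta_0: float,
                      theta_body: Sequence[float]) -> List[float]:
    """Return [P_0, P_1, ..., P_J] from Eqn. (9); P_0 is the outside good."""
    expu = [math.exp(-math.exp(theta_p) * p + theta_e / e + theta_a / a
                     + theta_body[body] + theta_0)
            for (e, a, p, body) in vehicles]
    denom = 1.0 + sum(expu)
    return [1.0 / denom] + [x / denom for x in expu]


# Illustrative market with B = 3 body styles and made-up coefficients.
vehicles = [(30.0, 8.0, 2.0, 0), (22.0, 6.0, 3.5, 1), (40.0, 11.0, 1.8, 2)]
theta_body = effects_coded([0.4, -0.1])          # third coefficient is -0.3
probs = mnl_probabilities(vehicles, -1.0, -10.0, 4.0, 1.0, theta_body)

# Same probabilities after moving 0.5 from the intercept into every dummy.
shifted = mnl_probabilities(vehicles, -1.0, -10.0, 4.0, 0.5,
                            [b + 0.5 for b in theta_body])
```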


4.2. Random Coefficients Logit Model (RCL). In the RCL model, choice probabilities
$P^{RCL}_{j,m}$ are defined for each vehicle $j$ in each market $m$ by

$$P^{RCL}_{j,m}(\mu, \sigma) = \int P^{MNL}_{j,m}(\theta)\, \phi(\theta \mid \mu, \sigma)\, d\theta \tag{10}$$

for $P^{MNL}_{j,m}$ as given in Eqn. (9). All random coefficients as written in Eqns. (9-10) are
assumed to be normally distributed, $\theta_l \sim \mathcal{N}(\mu_l, \sigma_l)$, $l = p, e, a, 0, 1, \dots, B$,
with mean $\mu_l$ and variance $\sigma_l^2$. Note, however, that this implies that the price
coefficient will be log-normal (e.g., [52, 53]). The density $\phi$ is thus a product of $4 + B$
independent normal densities each having two parameters. The RCL model thus has $8 + 2B$
coefficients we must estimate, two of which enter into the utility function nonlinearly.

Given synthetic revealed preference data we estimate the parameters $(\mu, \sigma)$ using
simulated MLE [2]. We perform Monte-Carlo sampling over random coefficients to obtain $I$
samples $\theta_i \sim \mathcal{N}(\mu, \mathrm{diag}(\sigma^2))$ and simulated RCL choice probabilities

$$\hat{P}^{RCL}_{j,m}(\mu, \sigma) = \frac{1}{I} \sum_{i=1}^{I} P^{MNL}_{j,m}(\theta_i) \tag{11}$$

We use $I = 1{,}000$ Monte-Carlo samples throughout this study if not otherwise mentioned.
As in the Logit model estimation, the perfect correlation between body style
coefficients can lead to multiple estimators that give the same choice probability. Therefore the
mean coefficients on body styles are also constrained to sum to zero. Note
that this does not imply that $\sum_b \theta_b = 0$ with probability one, but only that
$E[\sum_b \theta_b] = 0$.
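
A minimal sketch of the Monte-Carlo average in Eqn. (11): draw coefficient vectors from independent normals, evaluate the Logit probabilities of Eqn. (9) for each draw, and average. It assumes the spread parameters passed in are variances; the function names, the coefficient ordering, and the numerical values are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Vehicle = Tuple[float, float, float, int]   # (e, a, p, body style index in 0..B-1)


def mnl_probs(vehicles: Sequence[Vehicle], theta: Sequence[float]) -> List[float]:
    """Logit probabilities; theta = (theta_p, theta_e, theta_a, theta_0, body_1..body_B)."""
    th_p, th_e, th_a, th_0 = theta[:4]
    th_body = theta[4:]
    expu = [math.exp(-math.exp(th_p) * p + th_e / e + th_a / a + th_body[b] + th_0)
            for (e, a, p, b) in vehicles]
    denom = 1.0 + sum(expu)
    return [1.0 / denom] + [x / denom for x in expu]


def simulated_rcl_probs(vehicles: Sequence[Vehicle], mu: Sequence[float],
                        var: Sequence[float], n_draws: int = 1000,
                        seed: int = 0) -> List[float]:
    """Average Logit probabilities over n_draws independent normal draws."""
    rng = random.Random(seed)
    avg = [0.0] * (len(vehicles) + 1)
    for _ in range(n_draws):
        theta = [rng.gauss(m, math.sqrt(v)) for m, v in zip(mu, var)]
        for j, pr in enumerate(mnl_probs(vehicles, theta)):
            avg[j] += pr / n_draws
    return avg


# Illustrative market with B = 3; mean body style coefficients sum to zero.
vehicles = [(30.0, 8.0, 2.0, 0), (22.0, 6.0, 3.5, 1), (40.0, 11.0, 1.8, 2)]
mu = [-1.0, -10.0, 4.0, 1.0, 0.4, -0.1, -0.3]
var = [0.1, 2.2, 0.3, 0.5, 0.2, 0.2, 0.2]
rcl = simulated_rcl_probs(vehicles, mu, var, n_draws=1000, seed=3)

# With zero variances every draw equals the mean, so the RCL collapses to the MNL.
degenerate = simulated_rcl_probs(vehicles, mu, [0.0] * 7, n_draws=10)
```

Fixing the seed keeps the draws identical from one likelihood evaluation to the next, a common choice in simulated MLE so that the objective is a deterministic function of $(\mu, \sigma)$.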
4.3. Nested Multinomial Logit Model (NML). We also examine a NML model in
which vehicles with the same body style are assigned to the same nest. Suppose product
$j$ in market $m$ belongs to nest $\mathcal{N}_{b(j),m}$, where $b(j)$ is the body style of product $j$.
The probability that product $j$ is chosen in market $m$ is

$$P^{NML}_{j,m}(\theta) = P_{j|b(j),m}\, P_{b(j),m} \tag{12}$$

where $P_{b,m}$ is the probability that any product from nest $\mathcal{N}_{b,m}$ is chosen in market
$m$ and $P_{j|b(j),m}$ is the probability that product $j$ is chosen in market $m$, conditional on
nest $b(j)$ being chosen. $P_{j|b(j),m}$ follows the Logit formula in which only non-body style
features are involved in the utility:

$$P_{j|b(j),m}(\theta) = \frac{\exp\{u^{NML}_{j,m}(\theta_p, \theta_e, \theta_a)\}}{\sum_{k \in \mathcal{N}_{b(j),m}} \exp\{u^{NML}_{k,m}(\theta_p, \theta_e, \theta_a)\}} \tag{13}$$

with utility within the nest defined as:

$$u^{NML}_{j,m}(\theta_p, \theta_e, \theta_a) = -\exp\{\theta_p\}\, p_{j,m} + \frac{\theta_e}{e_{j,m}} + \frac{\theta_a}{a_{j,m}} \tag{14}$$

The choice of nest depends on the “nest utility”

$$V_{b,m}(\theta_p, \theta_e, \theta_a) = \log\Bigg( \sum_{j \in \mathcal{N}_{b,m}} \exp\{u^{NML}_{j,m}(\theta_p, \theta_e, \theta_a)\} \Bigg) \tag{15}$$

and also takes the Logit form

$$P^{NML}_{b,m}(\theta, \lambda) = \frac{\exp\{\theta_0 + \theta_b + \lambda_b V_{b,m}(\theta_p, \theta_e, \theta_a)\}}{1 + \sum_{c=1}^{B} \exp\{\theta_0 + \theta_c + \lambda_c V_{c,m}(\theta_p, \theta_e, \theta_a)\}} \tag{16}$$

We again constrain the body style dummy coefficients to sum to zero, for the same reason
as in the MNL and RCL models.

This formulation follows Daly’s version of the NML [54], rather than the “Generalized
Extreme Value” formulation given by McFadden [55]. The difference between the two
formulations is that McFadden’s model uses

$$\theta_0 + \theta_b + \lambda_b V_{b,m}\!\left( \frac{\theta_p}{\lambda_b}, \frac{\theta_e}{\lambda_b}, \frac{\theta_a}{\lambda_b} \right)$$

as the utility in Eqn. (16). This change is required for consistency with random utility
maximization, but there is still debate about whether that is essential in the model [2].
Both versions have similarities with consideration behavior, as discussed below.
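
The following sketch evaluates Eqns. (12)-(16) in the Daly form for one market. It assumes positive nesting parameters, so that a body style with no vehicles contributes nothing to the denominator of Eqn. (16); the function name, the vehicle tuple layout, and the coefficient values are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vehicle = Tuple[float, float, float, int]   # (e, a, p, body style index in 0..B-1)


def nml_probabilities(vehicles: Sequence[Vehicle], theta_p: float, theta_e: float,
                      theta_a: float, theta_0: float, theta_body: Sequence[float],
                      lam: Sequence[float]) -> List[float]:
    """Return [P_0, P_1, ..., P_J]; P_0 is the outside good."""
    # Within-nest utilities, Eqn. (14): body style does not enter.
    expu = [math.exp(-math.exp(theta_p) * p + theta_e / e + theta_a / a)
            for (e, a, p, _) in vehicles]
    nests: Dict[int, List[int]] = {}
    for j, (_, _, _, body) in enumerate(vehicles):
        nests.setdefault(body, []).append(j)

    # Nest terms exp{theta_0 + theta_b + lambda_b * V_b}, Eqns. (15)-(16).
    nest_sum = {b: sum(expu[j] for j in members) for b, members in nests.items()}
    nest_term = {b: math.exp(theta_0 + theta_body[b] + lam[b] * math.log(total))
                 for b, total in nest_sum.items()}
    denom = 1.0 + sum(nest_term.values())

    probs = [0.0] * len(vehicles)
    for b, members in nests.items():
        p_nest = nest_term[b] / denom                    # Eqn. (16)
        for j in members:
            probs[j] = p_nest * expu[j] / nest_sum[b]    # Eqns. (12)-(13)
    return [1.0 / denom] + probs


# Illustrative market: two vehicles share body style 0, body style 2 is empty.
vehicles = [(30.0, 8.0, 2.0, 0), (35.0, 9.0, 2.4, 0), (22.0, 6.0, 3.5, 1)]
theta_body = [0.4, -0.1, -0.3]
probs = nml_probabilities(vehicles, -1.0, -10.0, 4.0, 1.0, theta_body,
                          lam=[0.6, 0.8, 0.7])
probs_lambda_one = nml_probabilities(vehicles, -1.0, -10.0, 4.0, 1.0, theta_body,
                                     lam=[1.0, 1.0, 1.0])
```

With every nesting parameter equal to one, the nest term reduces to a sum of exponentiated utilities that include the body style dummy, so the probabilities coincide with those of the MNL model in Eqn. (9).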


4.4. Consider-Then-Choose Logit Model (CTC). In the CTC model body styles are
screened in the non-compensatory stage and do not enter the compensatory stage.
Preference in the compensatory stage is assumed to be homogeneous both among the population
and across all consideration sets. The body style screening rules are $S = (s_1, \dots, s_R)$,
characterizing all $R = 2^B - 1$ possible consideration sets $C_1, \dots, C_R$, except the “null set” in
which no body style is considered. Each screening rule is coded as a $B$-element binary vector
$s_r = (s_{r,1}, \dots, s_{r,B})$ where $s_{r,b} = 1$ if body style $b$ is acceptable, $s_{r,b} = 0$ otherwise. Thus

$$C_{r,m} = \{\, j : s_r^\top \delta_{j,m} \ge 1 \,\} \tag{17}$$

which means that a product will be considered as long as its body style is acceptable.

The choice probability for product $j$ in market $m$ is

$$P^{CTC}_{j,m}(\theta, \alpha) = \sum_{r=1}^{R} \alpha_r\, \frac{\exp\{u^{CTC}_{j,m}(\theta)\}}{1 + \sum_{k \in C_{r,m}} \exp\{u^{CTC}_{k,m}(\theta)\}} \tag{18}$$

where the $r$-th term is included if $j \in C_{r,m}$ and is zero otherwise, $\alpha_r = \alpha(s_r)$ is an
estimator of the probability that a randomly drawn individual in the population has screening
rule $s_r$, and utilities are defined by:

$$u^{CTC}_{j,m}(\theta) = -\exp\{\theta_p\}\, p_{j,m} + \frac{\theta_e}{e_{j,m}} + \frac{\theta_a}{a_{j,m}} + \theta_0 \tag{19}$$

We estimate this model with MLE, solving for both $\theta$ and $\alpha \in [0, 1]^R$, $\sum_r \alpha_r = 1$. These
constraints are required to ensure that $\alpha$ is a probability mass function.

Note that we directly estimate consideration set probability $P_{C_r} = \alpha(s_r)$ rather than
estimating parameters of the distribution $\alpha(s)$. Our case study is small enough to enable us
to enumerate the consideration sets, requiring only $R = 511$ $\alpha$ values to fully characterize
the distribution of consideration sets. This formulation allows us to estimate from the
same observed market share data using a MLE technique consistent with that employed
for the MNL, RCL, and NML models. The CTC model we estimate does not, however,
reflect the level of generality and efficiency available in the applications we review
above. This does not affect our main purpose, to demonstrate the impact on design of
non-compensatory behavior.
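
The sketch below enumerates the non-null body style screening rules and evaluates the mixture in Eqn. (18) for one market. The function names, the vehicle tuple layout, and the numerical values are hypothetical, and the enumeration order of the rules is simply the one produced by `itertools.product`.

```python
import itertools
import math
from typing import List, Sequence, Tuple

Vehicle = Tuple[float, float, float, int]   # (e, a, p, body style index in 0..B-1)
Rule = Tuple[int, ...]


def all_rules(n_body_styles: int) -> List[Rule]:
    """All binary acceptability vectors except the all-zero 'null' rule."""
    return [s for s in itertools.product((0, 1), repeat=n_body_styles) if any(s)]


def ctc_probabilities(vehicles: Sequence[Vehicle], theta: Sequence[float],
                      alpha: Sequence[float], rules: Sequence[Rule]) -> List[float]:
    """Return [P_0, P_1, ..., P_J] from Eqn. (18); theta = (p, e, a, 0) coefficients."""
    th_p, th_e, th_a, th_0 = theta
    expu = [math.exp(-math.exp(th_p) * p + th_e / e + th_a / a + th_0)
            for (e, a, p, _) in vehicles]
    probs = [0.0] * (len(vehicles) + 1)
    for a_r, s in zip(alpha, rules):
        if a_r == 0.0:
            continue
        members = [j for j, v in enumerate(vehicles) if s[v[3]] == 1]   # Eqn. (17)
        denom = 1.0 + sum(expu[j] for j in members)
        probs[0] += a_r / denom                          # outside good
        for j in members:
            probs[j + 1] += a_r * expu[j] / denom
    return probs


n_rules_case_study = len(all_rules(9))     # B = 9 body styles

# Illustrative market with B = 3 body styles and made-up coefficients.
vehicles = [(30.0, 8.0, 2.0, 0), (22.0, 6.0, 3.5, 1), (40.0, 11.0, 1.8, 2)]
theta = (-1.0, -10.0, 4.0, 1.0)
rules = all_rules(3)
uniform = ctc_probabilities(vehicles, theta, [1.0 / len(rules)] * len(rules), rules)

# Entire population uses the rule that accepts body styles 0 and 1 only.
alpha_single = [1.0 if s == (1, 1, 0) else 0.0 for s in rules]
screened = ctc_probabilities(vehicles, theta, alpha_single, rules)
```

Unlike the nest probabilities of Eqn. (16), the weights `alpha` here do not depend on the utilities, so changing a vehicle's price or performance never changes how often a given consideration set occurs.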


4.5. Connecting Nested Logit with Consideration. A few comments regarding the 
connection between the NML and CTC models are required, motivated by the similarity in 
the choice probabilities in the CTC and NML models. If the consideration sets used in the 
population are disjoint, then Eqn. (2) describes the choice probabilities in a single-level 
nested Logit model whose nests are given by the consideration sets. However a NML would 
use the specific parameterization of Pc r given in Eqn. (16). It is easy to see that any true 
value of Pc r can be recovered in this parameterization by taking the nesting parameter 
A r to be 1 and choosing the right value of the coefficients for attributes that are constant 
over the consideration set (e.g., body style). Swait [56] has linked the generalized nested 
logit model to construct a general consideration set explosion model. Similarly, it is also 
easy to show that a single-level cross-nested Logit model [57] can realize choice processes 
as described in Eqn. (2). 

Though these models are mathematically similar, their interpretations differ, which
drives a non-trivial difference in formalization. The NML model pictures rational, compensatory
individuals that might use any consideration set and choose any product, and
models consideration set frequency as a function of the expected maximum utility of choosing
from a given consideration set [2]. While a NML can recover CTC behavior by choosing the
right parameters, it does not necessarily yield the same predictions as designs change,
because nest selection is a function of in-nest utilities. In contrast, the CTC model views
individuals as drawn from a population with heterogeneous screening rules, and decouples
consideration set frequency from utility. When consideration set occurrence is, completely
or partially, independent of compensatory utilities, this distinction is meaningful.
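The distinction can be illustrated numerically. The following Python sketch is not the estimation code used in the study; it assumes a CTC model written as a mixture of within-set Logit probabilities with fixed set shares, a single-level nested Logit with a common nesting parameter, and hypothetical utilities, sets, and shares chosen only for illustration. With the nesting parameter equal to 1 and suitably chosen nest constants the two models agree at a baseline design, but once one vehicle's utility improves, the nested Logit shifts share towards that vehicle's nest while the CTC set shares stay fixed.

```python
import math


def ctc_probs(utils, sets, alpha):
    """Consider-then-choose Logit: a mixture, over consideration sets, of
    Logit choice within each set. alpha[r] is the population share that
    screens with set r; it does not depend on utilities."""
    probs = [0.0] * len(utils)
    for members, share in zip(sets, alpha):
        denom = sum(math.exp(utils[k]) for k in members)
        for j in members:
            probs[j] += share * math.exp(utils[j]) / denom
    return probs


def nml_probs(utils, nests, nest_consts, lam=1.0):
    """Single-level nested Logit with a common nesting parameter lam.
    Nest shares depend on each nest's inclusive value."""
    incl = [math.log(sum(math.exp(utils[k] / lam) for k in nest)) for nest in nests]
    weights = [math.exp(c + lam * iv) for c, iv in zip(nest_consts, incl)]
    total = sum(weights)
    probs = [0.0] * len(utils)
    for nest, w, iv in zip(nests, weights, incl):
        for j in nest:
            probs[j] = (w / total) * math.exp(utils[j] / lam - iv)
    return probs


def set_share(probs, members):
    """Total choice probability of the products in one set or nest."""
    return sum(probs[j] for j in members)


# Hypothetical example: two disjoint consideration sets over four vehicles.
sets = [[0, 1], [2, 3]]
alpha = [0.7, 0.3]
base = [1.0, 0.5, 0.2, -0.3]

# Nest constants chosen so that the nested Logit reproduces the CTC shares at `base`.
incl0 = [math.log(sum(math.exp(base[k]) for k in s)) for s in sets]
consts = [math.log(a) - iv for a, iv in zip(alpha, incl0)]

# Improve vehicle 0 (for example, a design change that raises its utility).
improved = [2.0, 0.5, 0.2, -0.3]

ctc_base, nml_base = ctc_probs(base, sets, alpha), nml_probs(base, sets, consts)
ctc_new, nml_new = ctc_probs(improved, sets, alpha), nml_probs(improved, sets, consts)

print("CTC share of set 0 after the change:", set_share(ctc_new, sets[0]))
print("NML share of nest 0 after the change:", set_share(nml_new, sets[0]))
```

The nest constants play the role of the coefficients on attributes that are constant over a consideration set; they are fit at one design and then held fixed, which is why the two models diverge once designs move.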


5. DESIGN OPTIMIZATION 

This section defines a single firm’s optimal vehicle design problem matching the stylized
market model discussed above. The firm’s objective is to maximize the expected profit of
its vehicle portfolio by deciding the number of vehicles J_f and choosing the body style, fuel
economy, acceleration, and price for each vehicle. We allow firms to offer multiple vehicles
of the same body style, as this is observed in real auto markets.


5.1. Engineering Model. Each vehicle is described by its 0-60 acceleration time (a, in
s), fuel economy (e, in mpg), weight (w, in 10^3 lbs), body style (b ∈ {1, ..., B}), “technology
content” (t, unitless), and price (p, in $10^4). In the original model [49], t is a continuous
proxy for efficiency improvement through adoption of discrete technology content; this
efficiency improvement can be directed towards either fuel economy or acceleration performance.
To accomplish this, and to represent a physical connection between acceleration
and fuel economy, e, a, w, and t are related by a function g_b(e, a, t) as given in Eqn. (20):

9b(e, a, t ) = - 10 ^ 6 - /3 9 const (b) - f3 9 (b) exp{-a} - /3f (b)t 
~ PZt( b )a 2 t ~ PUb)w - P 9 wa {b)wa 


Eqn. (20) can be written as an equality constraint gb(e,a,t ) = 0 on acceleration and fuel 
economy decisions. Unit costs are also a function of design variables expressed by the 
following function: 

, , c 6 (e, «) = Pconstib) + Pa(b) exp{—a} + 

+ Pw(b)w + P^ a (b)wa 


The body style specific coefficients in these models were estimated using detailed engineering
simulations from AVL Cruise in conjunction with confidential technology production cost
data provided to NHTSA by automakers in advance of the 2012-2016 fuel economy rulemaking
[49]. Tables ?? and ?? summarize body style specific coefficient values. In our case
study we assume that vehicle weight and technology content for each vehicle are fixed, and
thus do not include these as arguments in g_b or c_b. We use the average curb weights listed
in Table ??, based on 2005 model year vehicle data as reported in [58], and technology
content t = 20.


5.2. Formulation and Solution. Given a portfolio with J_f vehicles and body style vector
b, the optimal choices of fuel economy, acceleration, and price for each vehicle are those
that solve


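\[
\begin{aligned}
\text{maximize} \quad & \pi(\mathbf{p}, \mathbf{e}, \mathbf{a} \mid J_f, \mathbf{b}) \\
\text{with respect to} \quad & p_j > 0, \\
& L_{e,b(j)} \leq e_j \leq U_{e,b(j)}, \\
& L_{a,b(j)} \leq a_j \leq U_{a,b(j)} \quad \text{for all } j \\
\text{subject to} \quad & g_{b(j)}(e_j, a_j) = 0 \quad \text{for all } j
\end{aligned}
\tag{22}
\]

where expected profits are

\[
\pi(\mathbf{p}, \mathbf{e}, \mathbf{a} \mid J_f, \mathbf{b})
  = \sum_{j=1}^{J_f} P_j(\mathbf{p}, \mathbf{e}, \mathbf{a}, \mathbf{b}) \, \big( p_j - c_j(e_j, a_j) \big)
\tag{23}
\]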


and (L_{e,b}, U_{e,b}), (L_{a,b}, U_{a,b}) are body-style specific lower and upper bounds on fuel economy
and acceleration. Note that we are not specific about what probability model we use.
Eqn. (22) is smooth for any of the models, because choosing prices, fuel economy, and
acceleration does not affect screening in the CTC or, similarly, the nesting structure in
NML.

The optimal number of vehicles, body styles, and associated designs and prices can be
obtained by solving

\[
\begin{aligned}
\text{maximize} \quad & \pi^*(J_f, \mathbf{b}) \\
\text{with respect to} \quad & J_f \in \{1, \dots, B\}, \\
& b_j \in \{1, \dots, B\} \quad \text{for all } j = 1, \dots, J_f
\end{aligned}
\tag{24}
\]

where π*(J_f, b) is the optimal value of Eqn. (22) for given J_f and b. Note that we allow
for multiple vehicles with the same body style. Because enumerating all the feasible
choices of body styles b for vehicles is computationally prohibitive, we use a
Genetic Algorithm (GA) to solve Eqn. (24); our scheme is described in Appendix ??.
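To give a sense of the size of the outer problem, the sketch below counts the distinct body-style combinations in Eqn. (24). It rests on two assumptions that are not stated here: that vehicles within a portfolio are interchangeable, so a portfolio is an unordered multiset of body styles, and that B = 9, the value quoted in the discussion of screening rules. Each combination would still require solving the inner nonlinear program in Eqn. (22), typically from several starting points. The function names are hypothetical.

```python
from itertools import combinations_with_replacement
from math import comb


def count_body_style_portfolios(num_styles, max_vehicles):
    """Number of unordered body-style combinations with 1..max_vehicles vehicles,
    allowing repeated body styles (multisets)."""
    return sum(comb(num_styles + n - 1, n) for n in range(1, max_vehicles + 1))


def enumerate_body_style_portfolios(num_styles, max_vehicles):
    """Yield each body-style combination as a sorted tuple of style indices."""
    styles = range(1, num_styles + 1)
    for n in range(1, max_vehicles + 1):
        yield from combinations_with_replacement(styles, n)


B = 9  # assumed number of body styles
total = count_body_style_portfolios(B, B)
print(f"{total} body-style combinations, each needing its own solve of Eqn. (22)")
```

Treating portfolios as ordered vectors instead would give a much larger count, so this figure is best read as a lower bound on the work an exhaustive search would need.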


6. RESULTS 

This section presents performance results pertaining to choice model accuracy or predictive
power, design “error”, and profitability potential. To investigate how the amount of
market information influences performance, we performed the simulation experiment with
M = 10, 25, 50, 100, 200, 500, and 1000 markets. For each M we draw 20 separate sets of
J_m = 5 profiles and N_m = 100 choice observations, estimate MNL, RCL, NML, and CTC
models, and then use these models to design product portfolios, obtaining 20 separate sets
of model estimates and designs. Sampling different sets of share data for a given market
size allows us to gauge the effect of sampling variance in the data on model predictions,
design outcomes, and design value, while examining different numbers of markets allows us
to assess the asymptotic properties of the estimated models and their associated designs.
The MLE and design optimization routines were programmed in C, and the nonlinear
programs involved in estimation and design optimization were solved with the sequential
quadratic programming (SQP) solver SNOPT (version 7) [59]. All computations were undertaken
on a single Mac Pro tower with two quad-core 2.26 GHz processors and 32 GB of
RAM running OS X (10.6.8).


6.1. Predictive Power. The predictive power of the estimated models is validated on a
new data set that consists of M' markets, where each market m' = 1, ..., M' has a set of
vehicles J_{m'}. The Kullback-Leibler Divergence [60],

\[
\mathrm{KLD} = \frac{1}{M'} \sum_{m'=1}^{M'} \sum_{j \in \mathcal{J}_{m'}}
  P^{T}_{j,m'} \log\!\left( \frac{P^{T}_{j,m'}}{P_{j,m'}} \right),
\tag{25}
\]

captures how close the predicted choice probability distribution P is to the actual choice
probability distribution P^T in the validation set. Predictive share errors are evaluated via
Eqn. (25) using 1,000 markets of validation data different from the estimation data, but
drawn using the same approach.
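A market-averaged divergence of this kind can be computed in a few lines. The Python sketch below assumes the standard direction of the Kullback-Leibler divergence (true shares weighting the log-ratio of true to predicted shares) and natural logarithms; the function name and the example shares are hypothetical and serve only to illustrate the calculation.

```python
import math


def mean_kld(true_shares, predicted_shares):
    """Average, over validation markets, of the Kullback-Leibler divergence of
    the predicted choice probabilities from the true ones.

    Both arguments are lists of markets; each market is a list of choice
    probabilities over the vehicles offered in that market."""
    total = 0.0
    for p_true, p_pred in zip(true_shares, predicted_shares):
        for pt, pp in zip(p_true, p_pred):
            if pt > 0.0:  # terms with zero true probability contribute nothing
                total += pt * math.log(pt / pp)
    return total / len(true_shares)


# Hypothetical two-market validation set.
truth = [[0.5, 0.5], [0.2, 0.3, 0.5]]
perfect = [[0.5, 0.5], [0.2, 0.3, 0.5]]
biased = [[0.25, 0.75], [0.2, 0.3, 0.5]]

print("perfect prediction:", mean_kld(truth, perfect))
print("biased prediction: ", mean_kld(truth, biased))
```

The divergence is zero only when predictions match the true probabilities in every market, and it penalizes a model most heavily where it assigns low probability to vehicles that are in fact chosen often.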

Fig. 1 plots the divergence between predictions for estimated models and the true behavior
against the number of markets used to train the models. Increasing the amount of
market data available for estimation reduces both expected prediction error and the variance
of the error. Increasing the amount of data, however, does not result in traditional
compensatory models that match the predictive power of the CTC model. For example,
the divergence of the three traditional models’ predictions observing 1000 markets is larger
than the CTC prediction observing only 10 markets. When observing more than 50 markets,
the predictive power of the RCL and NML models is generally between those of MNL and
CTC, with RCL predictions appearing to be slightly closer to the true behavior. However,
when observing fewer than 10 markets the MNL model outperforms the RCL and NML models.
We believe designers should be particularly interested in performance when estimating
models with a relatively small amount of market data, because real revealed preference market
research often uses a very limited number of markets for estimation. For example,
econometric new vehicle market models most often use fewer than 20 markets (marked
with a vertical line in Fig. 1) [61, 62, 63, 2]. Our market simulation setting is not strictly
comparable to these studies because of a difference in the number of vehicle-observations
in each market, the complexity of real vehicle profiles, and the detail often given by population
demographics. But these results suggest caution given the small number of markets
usually used for model estimation in practice.


6.2. Decision Bias and Variance. We first define a “design error” metric to quantify
how different portfolios chosen using an estimated model are from portfolios that would
be chosen for the true behavior (perfect information). Comparing product portfolios is a
complicated task, and we do not suggest we have a uniquely good metric for comparison.
Essentially, our metric compares the relative difference in specific vehicle attributes
(excluding price, to focus on engineering aspects of strategy) for the same styles of vehicles
and the number of vehicles with different body styles. The specific numerical values of “design
error” for any given choice model are less important than comparisons across the different
choice model types we explore.

Figure 1. Kullback-Leibler Divergence (KLD) of predicted choice probability
distribution from true behavior. Solid lines represent the range of
observed values over 20 separate data sets while the dashed line represents
the average value.

Suppose a portfolio has J_f vehicles, each with body style b_j and design x_j = (e_j, a_j).
Denote the body style combinations of a portfolio as (n_1, n_2, ..., n_B), where n_b is the number
of vehicles in the portfolio that have body style b. We refer to the ideal portfolio as an
estimate of the globally optimal portfolio with the true behavior and denote the ideal
portfolio with superscript *; multiplicity of ideal portfolios is addressed below. Our
design error metric is


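\[
d = \sum_{b=1}^{B} N_b + \max\{ H^{+}, H^{-} \}
\tag{26}
\]

where

\[
N_b =
\begin{cases}
| n_b - n_b^{*} | / n_b^{*} & \text{if } n_b^{*} > 0 \\
n_b & \text{if } n_b^{*} = 0
\end{cases}
\tag{27}
\]

\[
H^{+} = \max_{j \,:\, n^{*}_{b(j)} > 0} \; \min_{k \,:\, b^{*}(k) = b(j)} \; d_w(x_j, x_k^{*})
\tag{28}
\]

\[
H^{-} = \max_{k \,:\, n_{b^{*}(k)} > 0} \; \min_{j \,:\, b(j) = b^{*}(k)} \; d_w(x_j, x_k^{*})
\tag{29}
\]

\[
d_w(x, x^{*}) = \frac{1}{2} \left( \frac{| e - e^{*} |}{e^{*}} + \frac{| a - a^{*} |}{a^{*}} \right)
\tag{30}
\]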


The first term in Eqn. (26), defined in Eqn. (27), captures differences in body style combinations by
penalizing differences in the number of vehicles offered with each body style. The second
term in Eqn. (26), composed of Eqns. (28-30), is a Hausdorff distance [64] comparing sets
of vehicles with the same body styles using the relative error metric in Eqn. (30). This
portion of the metric is zero so long as the sets of vehicles offered are equivalent, even if
offered in different multiplicities. If nonzero, this portion gives the relative error in the
attributes of any vehicle offered when that vehicle shares a body style with a body style
offered in the ideally optimal portfolio. Note that this distance measure gives an error only
in engineering decisions, although pricing is obviously important to profitable product design.
However, it is plausible that “incorrect” prices could be corrected relatively quickly in the
marketplace after offering a particular set of products, while errors in engineering features
cannot be. Section 6.4 explores this in more detail. See [50] for a generalization of Eqns.
(26)-(30) that accounts for prices.

Figure 2. Design error, measured as in Eqn. (26), for all models over all
numbers of markets observed (horizontal axis: number of markets). Solid lines
represent the range of observed values over 20 separate data sets while the
dashed line represents the average value.
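The metric can be sketched in Python as follows. This is an illustrative reading of the definition, not the code used in the study: it assumes the body-style term is the plain sum of the per-style count penalties, that the attribute term is the larger of the two directed distances taken over vehicles whose body style appears in both portfolios, and that this term is zero when the portfolios share no body style. All function names and the example portfolios are hypothetical.

```python
def relative_attribute_error(x, x_star):
    """Mean relative error in fuel economy and acceleration between a design
    x = (e, a) and an ideal design x_star = (e*, a*)."""
    (e, a), (e_star, a_star) = x, x_star
    return 0.5 * (abs(e - e_star) / e_star + abs(a - a_star) / a_star)


def style_count(portfolio, style):
    """Number of vehicles in the portfolio with the given body style."""
    return sum(1 for b, _ in portfolio if b == style)


def design_error(portfolio, ideal, num_styles):
    """Design error of `portfolio` relative to `ideal`.

    Each portfolio is a list of (body_style, (fuel_economy, acceleration))
    pairs with body_style in 1..num_styles."""
    # Penalty for offering a different number of vehicles of each body style.
    count_term = 0.0
    for b in range(1, num_styles + 1):
        n, n_star = style_count(portfolio, b), style_count(ideal, b)
        count_term += abs(n - n_star) / n_star if n_star > 0 else n

    # Directed distance from offered vehicles to ideal vehicles of the same style.
    h_plus = max(
        (min(relative_attribute_error(x, xs) for bs, xs in ideal if bs == b)
         for b, x in portfolio if style_count(ideal, b) > 0),
        default=0.0,  # assumption: no shared body style means no attribute error
    )
    # Directed distance from ideal vehicles to offered vehicles of the same style.
    h_minus = max(
        (min(relative_attribute_error(x, xs) for b, x in portfolio if b == bs)
         for bs, xs in ideal if style_count(portfolio, bs) > 0),
        default=0.0,
    )
    return count_term + max(h_plus, h_minus)


# Hypothetical portfolios: the ideal offers styles 1 and 2; the candidate offers
# only style 1, with fuel economy 10% above the ideal value.
ideal = [(1, (30.0, 8.0)), (2, (20.0, 10.0))]
candidate = [(1, (33.0, 8.0))]
print("design error:", design_error(candidate, ideal, num_styles=2))
```

In this example the missing body style contributes 1 to the error and the fuel economy deviation contributes 0.05, which shows how strongly the metric weights body-style composition relative to attribute differences.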

Fig. 2 plots design error for optimal decisions based on the MNL, RCL, NML, and CTC
choice models against the number of markets observed. Two features are of interest: design
bias refers to the difference between mean model-optimal designs and ideal optimal designs;
design variance refers to the spread of designs that might be made given different observed
markets used to estimate the choice models. With fewer than 25 markets the CTC model
has the lowest design bias, consistent with the performance this model showed in predictive
power. As the amount of market data grows, designs under the CTC model appear to be
converging to ideal designs. Designs chosen using a NML model compare well to CTC
designs when observing 50 or more markets. Designs chosen using a RCL model have the
largest variation among the models, and this variation cannot be overcome by increasing
the number of markets observed. Unlike the RCL, NML, and CTC models, the MNL model
suggests that identical vehicles with the same body style should be produced; this is reflected in
the high design error compared to the ideal design, in which there is wide diversity among
body styles.

In computing design error we presume that a computed ideal portfolio is a good representation
of portfolios required to achieve global optimality relative to the true behavior.
If there are distinct locally optimal portfolios that achieve nearly globally optimal profits, a
different metric would be required. Similarly, if profits were “flat” near the ideal portfolio,
design error loses meaning. In the next section we discuss decision profitability, which is
the ultimate metric of portfolio performance.

However, an important issue in our use of this metric pertains to whether our “ideal”
optimal designs are a good representation of the global optimum. Given suitable evolution
rules, GAs can explore the entire design space and thus global optimality is guaranteed for
discrete variable problems, but only because the design space can be enumerated; global
optimality cannot be guaranteed in finite time, with probability one, for continuous variable
problems. In our case, we applied a GA to a relatively small, B(B + 1) = 90 binary variable
problem, evolving populations for 50-100 generations (until the best and average fitness
values converged), and used a 50-trial multistart. Given the small size of the design space, this
process is very likely to obtain good solutions. We found a high degree of consistency in
both the body type combinations obtained by the GA and the design and price for each
body type found by the NLP; in terms of our own design metric, the optimal solutions
found by multistart differed by less than 10^{-6}. Thus we have high confidence that our
computed ideal optimal designs are as good as could be practically obtained.

6.3. Decision Profitability. A decision that differs from an ideal decision is not necessarily
unprofitable. The true profit of a product portfolio is its expected profit computed
under the true behavior, rather than the estimated model. We compute true profit π as

\[
\pi = \sum_{j=1}^{J_f} P_j(\mathbf{e}, \mathbf{a}, \mathbf{p}, \mathbf{b}) \, (p_j - c_j),
\tag{31}
\]

computing market share P_j by computing the choice probability in the true behavior
described in Section 3.1 instead of the sampling procedure described in Section 3.2. Here the
Monte-Carlo sample size used to approximate the compensatory stage random coefficients
model was I = 100,000. We do not include competitive firms’ vehicles in this profit
validation in order to be consistent with the design optimization problems, which do not
include competition. Note that the true profit for any model-optimal portfolio should be no
greater than the true profit given by the optimal portfolio under the true behavior. We refer to
the profit gained by optimal decisions under the true model as the ideal profit.
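The Monte-Carlo approximation of random coefficients Logit probabilities mentioned above can be sketched as follows. This is a generic illustration rather than the true-behavior model of Section 3.1: it assumes independent normally distributed coefficients, utilities that are linear in attributes, no outside good, and no screening stage. The function names, attribute values, and coefficient values are hypothetical.

```python
import math
import random


def logit_probs(utilities):
    """Logit choice probabilities, computed stably by subtracting the maximum utility."""
    top = max(utilities)
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def simulated_rcl_probs(attributes, means, std_devs, draws=100_000, seed=0):
    """Approximate random coefficients Logit probabilities by averaging Logit
    probabilities over `draws` sampled coefficient vectors.

    attributes: one row of attribute values per vehicle.
    means, std_devs: mean and standard deviation of each (normal) coefficient."""
    rng = random.Random(seed)
    probs = [0.0] * len(attributes)
    for _ in range(draws):
        beta = [rng.gauss(m, s) for m, s in zip(means, std_devs)]
        utils = [sum(b * x for b, x in zip(beta, row)) for row in attributes]
        for j, p in enumerate(logit_probs(utils)):
            probs[j] += p / draws
    return probs


# Hypothetical vehicles described by (price in $10^4, fuel economy in 10 mpg).
vehicles = [[2.0, 3.0], [3.0, 2.5], [1.5, 3.5]]
shares = simulated_rcl_probs(vehicles, means=[-1.0, 0.5], std_devs=[0.3, 0.2], draws=5_000)
print("simulated shares:", shares)
```

Profits then follow by weighting each vehicle's margin by its simulated share, as in the profit expression above; with all standard deviations set to zero the simulation collapses to a plain Logit model.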

Fig. 3 plots the percentage of ideal profits that can be achieved by choosing designs and
prices using an estimated model. CTC designs and prices are, not surprisingly, best able
to capture true profits. Even when observing only 10 markets, CTC designs and prices can
be expected to achieve at least 90% of the ideal profits, with the worst designs achieving
around 70% of the ideal profits. MNL designs and prices can be expected to obtain only
60% of the ideal profit due to a single body-style portfolio that lacks diversity. Estimating
a model with a limited amount of market data appears to affect the profitability of the RCL
model designs the most out of all the models, consistent with our observations regarding
design error variance. Even observing up to 50 markets it is possible for RCL designs and
prices to recover less than 40% of the ideal profits, depending on sampling variance in
the market data. More data results in RCL designs and prices within 90-95% of the ideal
profits. When observing fewer than 25 markets, NML designs capture approximately 20% less true
profit than the CTC designs and also show relatively high variation in true profits (facts
obscured by the log10-scale axis in Fig. 3). However, like RCL, NML designs and prices
can ultimately capture 90-95% of the true ideal profits when estimating the model with
enough data.

6.4. Pricing-On-Offering. An additional test assesses the degree to which a model suggests
unprofitable decisions simply because of a poor representation of preferences over
prices. Prices can, in principle, be changed up until the point-of-sale while design decisions
must often be fixed far in advance of sale. Thus it is reasonable to consider a case where
firms learn more about preferences when they offer the portfolio designed and exercise
price flexibility to maximize profits. Suppose that the MNL, RCL, NML, and CTC models
inform the design of the product portfolio but that prices can be changed even after the product
is offered (as in, e.g., [65]). How much more profit could the firm recover by using the true
choice behavior in order to set optimal prices, for fixed designs? While the firm is not likely
to actually know the true behavior, this value represents an upper bound on the profitability
of design decisions made using an estimated model when prices are flexible and determined
when offering the portfolio.

Fig. 4 plots the percent of ideal profits obtained using the vehicle portfolios suggested by
the estimated models, but offered at prices determined by optimizing profits for that
portfolio under the true behavior. From this perspective the RCL, NML, and CTC models
each have the potential to suggest nearly equivalently profitable design decisions. RCL
and NML, in particular, can suggest much more profitable portfolios if we admit pricing
flexibility than if we don't, and thus RCL and NML capture pricing preferences more weakly
than does the CTC model. Moreover, for intermediate numbers of markets (25, 50, 100,
and 200), the NML model appears to suggest the most profitable portfolios by a small
margin (less than 2.7%) that is exaggerated by the log10 axis scaling. Finally, even the
best possible pricing strategy cannot increase the true profitability of the single body-style
portfolio designed under the MNL model.

7. DISCUSSION AND LIMITATIONS 

There are several observations for designers to take away from this exploratory simulation 
study. 

First, given enough data, conventional compensatory models can reasonably support profitable
design decisions even when the population exhibits non-compensatory behavior.
Designs based on estimated RCL and NML models were capable of obtaining above 90%
of the ideal profits (Fig. 3); however, this required more than twice the amount of market
data (50 markets) that might typically be available (20 markets), judging from the vehicle
modeling literature. The RCL and NML models could suggest designs that obtain almost
100% when an ideal pricing strategy is followed (Fig. 4), but this would require learning
preferences exactly when actually offering the vehicles designed. This is practically
impossible but does suggest that a significant portion of the “error” made with conventional
models pertains to pricing bias, not design bias. However, designers should be aware of the
possible side effects of different compensatory model structures: the MNL model suggests
portfolios with identical body styles; the RCL model, estimated on limited amounts of
aggregate market share data, is highly sensitive to sample error, leading to large variations
in optimal designs; and the NML model, while it might capture optimal designs very well
if the nesting structure reflects consideration, suggests biased pricing decisions and thus
cannot present accurate forecasts of design profitability. Designers also need to take into
account the amount of information available to train their model when they decide what
model to use. According to our simulation, using the MNL might be more reliably profitable
than using the RCL and NML models if market data are very limited, because noise in the
data induces greater variance in designs suggested by RCL and NML models.

Figure 3. Percent of ideal profits obtained under the true behavior when
choosing designs and prices with estimated models. Note the log10
scale y-axis focuses on differences from 100%. Solid lines represent
the range of observed values over 20 separate data sets while the
dashed line represents the average value.

Figure 4. Percent of ideal profits obtained by designs chosen using
estimated models, but optimizing prices for these designs with
knowledge of the true choice behavior. Note the log10 scale y-axis
focuses on differences from 100%. Solid lines represent the range of
observed values over 20 separate data sets while the dashed line
represents the average value.

Second, modeling the heterogeneity in the screening rules may capture more value to
design than modeling heterogeneity in the compensatory stage. This is most directly
observed by comparing the CTC and RCL models. The RCL model ignores screening
stage heterogeneity, and achieved only 30% of the ideal profit (on average) with 10 markets
while displaying an unacceptably large sensitivity to sample variance with limited training
data. The CTC model with only 10 markets of training data gives a firm expected profit
that is at least 80% as much as what they could get with perfect knowledge. Recall
that the CTC model is mis-specified, in that it ignores compensatory stage heterogeneity.
Other evidence comes from the NML. NML and CTC share a two-stage modeling
structure. In effect, our NML is a close approximation to a CTC model assuming that
individuals consider one, and only one, vehicle body style. Decisions made based on the
NML are most often more profitable than those made with the RCL model for all amounts
of training data. This observation may be driven by the limited amount of heterogeneity in
our assumed true behavior (see Table 3), suggesting further research is required.

Note also that the CTC model has the potential to be seriously overfit. For example, the
CTC model with M = 50 has more than twice as many parameters (515) as observations
(250), but is still the most predictively accurate model (Fig. 1, right) and results in the
most profitable decisions (Fig. 3, right). Conventional wisdom would suggest that at least
as many observations as parameters are required for a valid model; i.e., the CTC model
requires, at a minimum, M ≥ 103 markets with 5 vehicles per market. We believe that the
stability of estimated model predictions is more important than the ratio of parameters to
observations; Fig. 1 shows that the predictive power of the estimated CTC model is as
good as it can get with as few as 50 markets. However, overfitting effects may explain
the difference between NML and CTC model performance when pricing on offering (Fig.
4): our CTC model presumes that all 511 screening rules may be in use by the population
generating the data, while the NML approximates a CTC model with only B = 9 screening
rules. Slightly better performance with the simpler model suggests some overfitting may
be occurring, although any such overfitting could be easily corrected by restricting the
number of nonzero α coefficients in the CTC estimation.

Figure 5. Expected design error (left) and expected profit error (right)
versus Kullback-Leibler divergence of choice prediction. Complete recovery
of the ideal profit yields 0% error while zero profit yields 100% error.
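The parameter counting behind the overfitting remark can be checked with a short calculation. The sketch below assumes that the 511 screening-rule parameters correspond to the nonempty subsets of B = 9 body styles, and that the remaining 4 of the 515 parameters belong to the compensatory utility; the latter is inferred from the difference between the two counts rather than stated explicitly. The function names are hypothetical.

```python
import math


def ctc_parameter_count(num_styles, num_utility_params):
    """Screening-rule parameters (one per nonempty subset of body styles)
    plus compensatory utility parameters."""
    return (2 ** num_styles - 1) + num_utility_params


def min_markets(num_params, vehicles_per_market):
    """Smallest number of markets giving at least as many share observations
    as parameters."""
    return math.ceil(num_params / vehicles_per_market)


n_params = ctc_parameter_count(num_styles=9, num_utility_params=4)
n_obs_at_50 = 50 * 5  # share observations with M = 50 markets of 5 vehicles

print("CTC parameters:", n_params)
print("observations at M = 50:", n_obs_at_50)
print("markets needed for one observation per parameter:", min_markets(n_params, 5))
```

Restricting the estimation to a subset of plausible screening rules would shrink the first term directly, which is the correction suggested in the text.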


Third, assuming that better predictive power indicates better design decisions is reasonable
but not necessarily true. Pearson correlation coefficients are positive but weak: 0.62
between predictive power (divergence) and design error, 0.73 between predictive power and
profit error (measured as error, not percent of ideal profits recovered), and 0.73 between
design and profit errors. Fig. 5 plots the average design error and average true profitability
against the average Kullback-Leibler divergence of the four models estimated under
two market information conditions: 10 and 1,000 markets. Here, the average is taken over
different data sets with the same number of markets. While there is a general trend that
lower divergence (better model predictive power) is consistent with lower design error,
deviation from this trend is also observed. For example, NML has, on average, worse choice
predictions but better designs than RCL for both 10 and 1,000 markets' worth of data.
Lower model divergence also generally indicates less loss of profit. However, there are
exceptions, such as the comparison between the NML and the RCL. Note also the difference
in scales: the MNL model does not appear to predict that much worse than the NML or
RCL models while suggesting designs that capture almost no profit relative to NML or
RCL.

These results show that the true profitability of designs made using traditional models
cannot be judged from predictive power alone. While further investigation of the relationship
between predictive accuracy and decision value across a range of design problems and
market conditions is needed, it is clear that choice models with structure representative of
the underlying choice process are better for design even if they may not show significant
benefits from the perspective of modeling choice alone.

Important limitations of our study are as follows. 

Our study has focused on the value of incorporating prior knowledge on screening without
demonstrating the process needed to obtain that knowledge. Nor has our study examined
the consequences of misspecified prior knowledge. The CTC model in this study is
able to estimate the distribution of the possible consideration sets from choice data given
that the attributes involved in the screening process are known and limited. Our presumed
behavior, screening over body style, is a reasonable prior for the case study and is represented
in some form in every model we tested. Aggregate share data is insufficient to
infer what attributes and screens are involved in the consideration stage. We are currently
mirroring this simulation study within the context of survey design for both choice-based
conjoint and consideration-based questions [32] in which screening rules can be statistically
inferred. Subjective beliefs, however, often inform choice model construction; they underlie
decisions about what utility function to use and what distribution the error term takes
(including heterogeneous preferences and nesting structures). While we must assume that
misspecification of screening rules would impact design outcomes, this is a generic problem
for choice modeling.

The design problem in our case study is also simplistic. The engineering model merely
focuses on a body style specific fuel consumption-acceleration relationship and a cost function
that depends only on fuel economy and acceleration. The screening rules we used were
independent of continuous features such as price and fuel economy; in contrast, the study
from which we drew screening rules estimated rules over body style, brand, fuel economy,
price, quality, safety, power, and powertrain [32]. Future studies should include richer
engineering models as well as more complex screening.

Bayesian methods [66] for model estimation were also not used in this study. Importantly,
Bayesian methods provide an alternative path to estimate the parameters
of a choice model, not fundamentally different models. Theoretically speaking, maximum
likelihood and Bayesian estimators are often similar; in particular, the posterior mean of
a Bayesian estimator is asymptotically equivalent to the maximum likelihood estimator
[2]. Empirically, Bayesian estimators have been reported to better fit small-sample
data but to have predictive power and parameter recovery similar to maximum likelihood
estimation [67]. In the context of our study we might then expect Bayesian estimators
to achieve larger-sample performance with fewer data, but not to qualitatively change
the comparison between conventional compensatory models and non-compensatory models
when consumers consider.


8. CONCLUSIONS 

This paper explores the impact of consideration behavior on optimal design for market
systems models. We presented a simulation study of vehicle portfolio design for a population
with heterogeneous screening over body style and heterogeneous compensatory evaluations
after screening. With synthetically generated aggregate market share data we estimate
multinomial Logit, random coefficient Logit, nested multinomial Logit, and consider-then-choose
Logit models. All four models contain some representation of screening, and all are
misspecified in at least one dimension of the true behavior. We use the estimated models to
optimize designs for a single firm and compare model performance in terms of predictive
power, design error, and profitability. We find that capturing heterogeneous consideration,
when it exists, is more important than capturing heterogeneous tradeoffs. This can be
accomplished with consider-then-choose Logit, but also with the right nested Logit model.
Decisions made using Logit models are simplistic, suggesting portfolios with a single body
style, and decisions made using random coefficients Logit models are noisy; with limited
amounts of data, Logit models may often lead to more profitable decisions than random
coefficients Logit. We also observe that higher model predictive power generally does imply
a more profitable design decision, but that there are cases where poorer predictors can
yield higher profits.